LGM: Mining Frequent Subgraphs from Linear Graphs

Authors

Yasuo Tabei

Daisuke Okanohara

Shuichi Hirose

Koji Tsuda

Abstract

A linear graph is a graph whose vertices are totally ordered. Biological and linguistic sequences with interactions among symbols are naturally represented as linear graphs. Examples include protein contact maps, RNA secondary structures and predicate-argument structures. Our algorithm, linear graph miner (LGM), leverages the vertex order for efficient enumeration of frequent subgraphs. Based on the reverse search principle, the pattern space is systematically traversed without expensive duplication checking. Disconnected subgraph patterns are particularly important in linear graphs due to their sequential nature. Unlike conventional graph mining algorithms detecting connected patterns only, LGM can detect disconnected patterns as well. The utility and efficiency of LGM are demonstrated in experiments on protein contact maps.
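The notion of a linear graph and of a possibly disconnected pattern occurring in it can be made concrete with a small sketch. This is not the LGM algorithm itself: it is a brute-force containment check, written under the assumptions that vertices carry labels, edges are unlabeled, and a pattern occurs in a graph when some order-preserving choice of vertices matches the labels and contains every pattern edge. The names `LinearGraph`, `make_graph`, `occurs_in`, and `support` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class LinearGraph:
    """Vertices are positions 0..n-1 in a fixed total order."""
    labels: tuple      # labels[i] is the label of the i-th vertex
    edges: frozenset   # pairs (i, j) with i < j


def make_graph(labels, edges):
    # Normalize each edge so the smaller position comes first.
    return LinearGraph(
        tuple(labels),
        frozenset((min(i, j), max(i, j)) for i, j in edges),
    )


def occurs_in(pattern, graph):
    """Brute force: try every increasing choice of graph positions.

    combinations() yields positions in increasing order, so the
    mapping from pattern vertices to graph vertices preserves order.
    """
    n, m = len(pattern.labels), len(graph.labels)
    for pos in combinations(range(m), n):
        if any(pattern.labels[k] != graph.labels[pos[k]] for k in range(n)):
            continue
        if all((pos[i], pos[j]) in graph.edges for i, j in pattern.edges):
            return True
    return False


def support(pattern, database):
    """Number of database graphs that contain the pattern."""
    return sum(occurs_in(pattern, g) for g in database)


# Toy database (assumed data, not from the paper).
g1 = make_graph("ABCAB", [(0, 1), (3, 4)])
g2 = make_graph("AB", [(0, 1)])
database = [g1, g2]

# A disconnected pattern: two separate A-B edges, one after the other.
disconnected = make_graph("ABAB", [(0, 1), (2, 3)])
# The same edge with its endpoints in the opposite order.
reversed_edge = make_graph("BA", [(0, 1)])
```

In this toy data the disconnected pattern is found only in `g1`, while the reversed edge is found in neither graph, because the vertex order is part of the pattern. The exhaustive search is exponential in the pattern size; the efficiency claims in the abstract rest on the reverse search enumeration, which this sketch does not attempt to reproduce.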

Similar resources

FS3: A sampling based method for top-k frequent subgraph mining

Mining labeled subgraphs is a popular research task in data mining because of its potential applications in many scientific domains. All existing methods for this task explicitly or implicitly solve the subgraph isomorphism problem, which is computationally expensive, so they scale poorly when the graphs in the input database are large. In this work, we pr...

A Two-Phase Algorithm for Differentially Private Frequent Subgraph Mining

Mining frequent subgraphs from a collection of input graphs is an important task for exploratory data analysis on graph data. However, if the input graphs contain sensitive information, releasing discovered frequent subgraphs may pose considerable threats to individual privacy. In this paper, we study the problem of frequent subgraph mining (FSM) under the rigorous differential privacy model. W...

Feature Selection in Frequent Subgraphs Feature Selektion auf häufigen Subgraphen

Bioinformatics is producing a wealth of network data, ranging from molecular graphs to complex gene expression networks. To distinguish different classes of graphs, such as different functional classes of proteins, one common approach is to search for common frequent subgraphs. However, this method suffers from the fact that it quickly generates thousands or even millions of frequent subgraphs....

Mining frequent subgraphs from ’easy’ classes

Recently, there has been increasing interest in mining structured data. Several frequent subgraph mining systems have been proposed. However, these usually consider general graphs. One can show that frequent subgraph mining for general graphs cannot be performed in output-polynomial time. In practice, however, data usually does not consist of arbitrary graphs but has a much simpler structure. In t...

Combining near-optimal feature selection with gSpan

Graph classification is an increasingly important step in numerous application domains, such as function prediction of molecules and proteins, computerised scene analysis, and anomaly detection in program flows. Among the various approaches proposed in the literature, graph classification based on frequent subgraphs is a popular branch: Graphs are represented as (usually binary) vectors, with c...

Publication date: 2011
